Abstract

The eigenvector corresponding to the second smallest eigenvalue of the Laplacian of a graph, known
as the Fiedler vector, has a number of applications in areas that include matrix reordering, graph
partitioning, protein analysis, data mining, machine learning, and web search. The computation of the
Fiedler vector has been regarded as an expensive process as it involves solving a large eigenvalue
problem. We present a novel and efficient parallel algorithm for computing the Fiedler vector of large
graphs based on the Trace Minimization algorithm (Sameh et al.). We compare the parallel performance
of our method with a multilevel scheme, designed specifically for computing the Fiedler vector, which
is implemented in the routine MC73_Fiedler of the Harwell Subroutine Library (HSL). In addition, we
compare the quality of the Fiedler vector for the application of weighted matrix reordering and provide
a metric for measuring the quality of reordering.

1 Introduction 

The second smallest eigenvalue and the corresponding eigenvector of the Laplacian of a graph have been
used in a number of application areas including matrix reordering [11, 10, 9], graph partitioning [14, 15],
machine learning [13], protein analysis and data mining [5, 8, 18], and web search [4]. The second smallest
eigenvalue of the Laplacian of a graph is sometimes called the algebraic connectivity of the graph, and the
corresponding eigenvector is known as the Fiedler vector, due to the pioneering work of Fiedler [3].

For a given $n \times n$ sparse symmetric matrix $A$, or an undirected weighted graph with positive weights,
one can form the weighted Laplacian matrix, $L_w$, as follows:

$$L_w(i,j) = \begin{cases} \sum_{\hat{j} \neq i} |A(i,\hat{j})| & \text{if } i = j, \\ -|A(i,j)| & \text{if } i \neq j. \end{cases} \tag{1}$$

One can obtain the unweighted Laplacian by simply replacing each nonzero element of the matrix $A$ by
1. In this paper, we focus on the more general weighted case; the method we present is also applicable
to the unweighted Laplacian. Since the Fiedler vector can be computed independently for each connected
component of a disconnected graph, we assume that the graph is connected. The eigenvalues of $L_w$ are
$0 = \lambda_1 < \lambda_2 \le \lambda_3 \le \dots \le \lambda_n$. The eigenvector $x_2$ corresponding to the
smallest nontrivial eigenvalue $\lambda_2$ is the sought Fiedler vector. If the matrix $A$ is nonsymmetric,
we use $(|A| + |A^T|)/2$ instead.
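
As a concrete illustration of the construction just described, the following minimal sketch builds a dense weighted Laplacian from a small matrix. It assumes that the diagonal of $A$ is ignored and that the edge weights are the symmetrized magnitudes $(|A| + |A^T|)/2$; the function name `weighted_laplacian` and the list-of-rows storage are illustrative choices, not part of the implementation discussed in this paper.

```python
def weighted_laplacian(a):
    """Dense weighted Laplacian of a square matrix given as a list of rows.

    Assumptions (illustrative): diagonal entries of `a` are ignored and the
    edge weight between i and j is (|a[i][j]| + |a[j][i]|) / 2.
    """
    n = len(a)
    # Symmetrized magnitudes, so a nonsymmetric input is handled as well.
    w = [[(abs(a[i][j]) + abs(a[j][i])) / 2.0 for j in range(n)] for i in range(n)]
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                lap[i][j] = -w[i][j]      # off-diagonal: negative edge weight
                lap[i][i] += w[i][j]      # diagonal: sum of incident weights
    return lap
```

Every row of the result sums to zero, so the constant vector lies in the null space of the Laplacian; this is the trivial eigenpair that the Fiedler computation has to skip.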

A state-of-the-art multilevel solver [7] called MC73_Fiedler for computing the Fiedler vector is
implemented in the Harwell Subroutine Library (HSL) [6]. It uses a series of levels of coarser graphs, where
the eigenvalue problem corresponding to the coarsest level is solved via the Lanczos method for estimating
the Fiedler vector. The results are then prolongated to the finer graphs, and Rayleigh Quotient Iterations
(RQI) with shift and invert are used for refining the eigenvector. Linear systems encountered in RQI are
solved via the SYMMLQ algorithm. We consider MC73_Fiedler to be one of the best uniprocessor
implementations for determining the Fiedler vector.

In Section 2, we describe a novel parallel solver, TraceMin-Fiedler, based on the Trace Minimization
algorithm (TraceMin) [17, 16], and present results comparing it to MC73_Fiedler in Section 3. Finally, in
Section 4, we compare the quality of the Fiedler vectors obtained by both methods for reordering sparse
matrices.



2 The TraceMin-Fiedler Algorithm 

We consider solving the standard symmetric eigenvalue problem

$$Lx = \lambda x \tag{2}$$

where $L$ denotes the weighted Laplacian, using the TraceMin scheme for obtaining the Fiedler vector. The
basic TraceMin algorithm [17, 16] can be summarized as follows. Let $X_k$ be an approximation of the
eigenvectors corresponding to the $p$ smallest eigenvalues such that $X_k^T L X_k = \Sigma_k$ and
$X_k^T X_k = I$, where $\Sigma_k = \mathrm{diag}(\rho_1^{(k)}, \dots, \rho_p^{(k)})$. The updated
approximation is obtained by solving the minimization problem

$$\min \ \mathrm{tr}\,(X_k - \Delta_k)^T L (X_k - \Delta_k), \quad \text{subject to } \Delta_k^T X_k = 0. \tag{3}$$

This in turn leads to the need for solving a saddle point problem, in each iteration of the TraceMin algorithm,
of the form

$$\begin{bmatrix} L & X_k \\ X_k^T & 0 \end{bmatrix} \begin{bmatrix} \Delta_k \\ N_k \end{bmatrix} = \begin{bmatrix} L X_k \\ 0 \end{bmatrix} \tag{4}$$

where the Schur complement system $(X_k^T L^{-1} X_k) N_k = X_k^T X_k$ needs to be solved. Once $\Delta_k$
and $N_k$ are obtained, $(X_k - \Delta_k)$ is used to obtain $X_{k+1}$, which forms the section

$$X_{k+1}^T L X_{k+1} = \Sigma_{k+1}, \quad X_{k+1}^T X_{k+1} = I. \tag{5}$$
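
For reference, the step from the saddle point problem of this section to its Schur complement system can be written out explicitly. The following short derivation is a sketch under the assumption that the solves with $L$ are well defined on the relevant subspace (for the Laplacian, this is what the diagonal perturbation and the deflation step of the algorithm are meant to ensure); the auxiliary symbol $W_k$ denotes the solution of $L W_k = X_k$.

```latex
\begin{align*}
L\Delta_k + X_k N_k &= L X_k
  &&\Rightarrow\quad \Delta_k = X_k - L^{-1} X_k N_k, \\
X_k^T \Delta_k &= 0
  &&\Rightarrow\quad X_k^T X_k - \bigl(X_k^T L^{-1} X_k\bigr) N_k = 0, \\
L W_k &= X_k
  &&\Rightarrow\quad \bigl(X_k^T W_k\bigr) N_k = X_k^T X_k, \qquad
  X_k - \Delta_k = W_k N_k .
\end{align*}
```

Read this way, each outer iteration needs only solves with $L$ for a block of right-hand sides, followed by a small dense system whose order equals the number of columns of $X_k$.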

The TraceMin-Fiedler algorithm, which is based on the basic TraceMin algorithm, is given in Figure 1.

The most time-consuming part of the algorithm is solving the saddle-point problem in each outer
TraceMin iteration. This involves, in turn, solving large sparse symmetric positive semi-definite systems
of the form

$$L W_k = X_k \tag{6}$$

using the Conjugate Gradient algorithm with a diagonal preconditioner, given in Figure 2. Our main
enhancements of the basic TraceMin scheme are contained in step 8, which solves systems involving the
Laplacian, and step 7, which concerns the deflation process. In the TraceMin-Fiedler algorithm, not only is
the coefficient matrix $L$ guaranteed to be symmetric positive semi-definite, but its diagonal (the
preconditioner) is also guaranteed to have positive elements. On the other hand, in MC73_Fiedler there is no
guarantee that the linear systems arising in the RQI with shift and invert are symmetric positive
semi-definite with positive diagonal elements. Hence, MC73_Fiedler uses SYMMLQ without any
preconditioning to solve the linear systems in the Rayleigh Quotient Iterations.
Algorithm 1:

Data: $L$ is the $n \times n$ Laplacian matrix defined in Eqn. (1), $\epsilon_{out}$ is the stopping criterion
      for the $\|\cdot\|_\infty$ of the eigenvalue problem residual
Result: $x_2$ is the eigenvector corresponding to the second smallest eigenvalue of $L$
$p \leftarrow 2$; $q \leftarrow 3p$;
$n_{conv} \leftarrow 0$; $X_{conv} \leftarrow [\ ]$;
$\hat{L} \leftarrow L + \|L\|_\infty 10^{-12} I$;
$D \leftarrow$ the diagonal of $L$;
$\hat{D} \leftarrow$ the diagonal of $\hat{L}$;
$X_1 \leftarrow \mathrm{rand}(n, q)$;
for $k = 1, 2, \dots, max\_it$ do
    1. Orthonormalize $X_k$ into $V_k$;
    2. Compute the interaction matrix $H_k \leftarrow V_k^T L V_k$;
    3. Compute the eigendecomposition $H_k Y_k = Y_k \Sigma_k$ of $H_k$. The eigenvalues $\Sigma_k$ are
       arranged in ascending order and the eigenvectors are chosen to be orthogonal;
    4. Compute the corresponding Ritz vectors $X_k \leftarrow V_k Y_k$;
       Note that $X_k$ is a section, i.e. $X_k^T L X_k = \Sigma_k$, $X_k^T X_k = I$;
    5. Compute the relative residual $\|L X_k - X_k \Sigma_k\|_\infty / \|L\|_\infty$;
    6. Test for convergence: If the relative residual of an approximate eigenvector is less than
       $\epsilon_{out}$, move that vector from $X_k$ to $X_{conv}$ and replace $n_{conv}$ by $n_{conv} + 1$.
       If $n_{conv} \ge p$, stop;
    7. Deflate: If $n_{conv} > 0$, $X_k \leftarrow X_k - X_{conv}(X_{conv}^T X_k)$;
    8. if $n_{conv} = 0$ then
           Solve the linear system $\hat{L} W_k = X_k$ approximately with relative residual $\epsilon_{in}$
           via the PCG scheme using the diagonal preconditioner $\hat{D}$;
       else
           Solve the linear system $L W_k = X_k$ approximately with relative residual $\epsilon_{in}$
           via the PCG scheme using the diagonal preconditioner $D$;
    9. Form the Schur complement $S_k \leftarrow X_k^T W_k$;
    10. Solve the linear system $S_k N_k = X_k^T X_k$ for $N_k$;
    11. Update $X_{k+1} \leftarrow X_k - \Delta_k = W_k N_k$;

Figure 1: TraceMin-Fiedler algorithm.
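
To make the flow of steps 7 to 11 in Figure 1 more tangible, the following is a deliberately simplified, single-vector sketch. It assumes that the constant null vector of a connected graph's Laplacian is already known (in the algorithm above it is found by the iteration itself), so that only one vector is iterated and the Schur complement becomes a scalar; with these simplifications the update reduces to a deflated inverse iteration. The helper names, the plain (unpreconditioned) CG solver, and the path-graph test problem are all illustrative assumptions, not the paper's parallel implementation.

```python
import math

def matvec(a, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def cg(a, b, tol=1e-12, max_it=200):
    """Plain conjugate gradients from a zero initial guess (stand-in for PCG)."""
    x = [0.0] * len(b)
    r, p = list(b), list(b)
    rr = dot(r, r)
    stop = (tol ** 2) * rr
    for _ in range(max_it):
        if rr <= stop:
            break
        ap = matvec(a, p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x

def path_laplacian(n):
    """Unweighted Laplacian of a path graph with n vertices (test problem)."""
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        lap[i][i] += 1.0
        lap[i + 1][i + 1] += 1.0
        lap[i][i + 1] = lap[i + 1][i] = -1.0
    return lap

def deflate_and_normalize(x, v):
    c = dot(v, x)
    x = [xi - c * vi for xi, vi in zip(x, v)]      # step 7: remove the null vector
    nrm = math.sqrt(dot(x, x))
    return [xi / nrm for xi in x]                  # step 1 for a single vector

def fiedler_single_vector(lap, outer_its=40):
    n = len(lap)
    v = [1.0 / math.sqrt(n)] * n                   # assumed-known null vector
    x = [math.sin(i + 1.0) for i in range(n)]      # arbitrary deterministic start
    for _ in range(outer_its):
        x = deflate_and_normalize(x, v)
        w = cg(lap, x)                             # step 8: solve L w = x
        nk = dot(x, x) / dot(x, w)                 # steps 9-10: scalar Schur system
        x = [nk * wi for wi in w]                  # step 11: x_{k+1} = w N
    x = deflate_and_normalize(x, v)
    return dot(x, matvec(lap, x)), x               # Ritz value and Ritz vector
```

For the path graph with 8 vertices, the returned Ritz value should approach $2 - 2\cos(\pi/8) \approx 0.152$, the second smallest eigenvalue of that Laplacian. The block version in Figure 1 replaces the scalar normalization with a Rayleigh-Ritz step and carries several vectors at once.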
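Algorithm 2:

Data: $Lx = b$, $L$ is the $n \times n$ Laplacian matrix defined in Eqn. (1), $\epsilon_{in}$ is the stopping
      criterion for the $\|\cdot\|_\infty$ of the relative residual, $b$ is the right hand side, and $M$ is the
      preconditioner
Result: $x$ is the solution of the linear system
Solve the preconditioned system $(M^{-1/2} L M^{-1/2})(M^{1/2} x) = (M^{-1/2} b)$;
$\tilde{L} = M^{-1/2} L M^{-1/2}$;
$\tilde{b} = M^{-1/2} b$;
$\tilde{x} = M^{1/2} x$;
$\tilde{x}_0 \leftarrow [0, \dots, 0]^T$;
$\tilde{r}_0 \leftarrow \tilde{b} - \tilde{L}\tilde{x}_0$;
$\tilde{p}_0 \leftarrow \tilde{r}_0$;
for $k = 0, 1, \dots, max\_it$ do
    1. $\alpha_k \leftarrow \dfrac{\tilde{r}_k^T \tilde{r}_k}{\tilde{p}_k^T \tilde{L} \tilde{p}_k}$;
    2. $\tilde{x}_{k+1} \leftarrow \tilde{x}_k + \alpha_k \tilde{p}_k$;
    3. $\tilde{r}_{k+1} \leftarrow \tilde{r}_k - \alpha_k \tilde{L} \tilde{p}_k$;
    4. if $\|\tilde{r}_{k+1}\|_\infty / \|\tilde{r}_0\|_\infty < \epsilon_{in}$ then
           exit
    5. $\beta_k \leftarrow \dfrac{\tilde{r}_{k+1}^T \tilde{r}_{k+1}}{\tilde{r}_k^T \tilde{r}_k}$;
    6. $\tilde{p}_{k+1} \leftarrow \tilde{r}_{k+1} + \beta_k \tilde{p}_k$;

Figure 2: Preconditioned Conjugate Gradient scheme for solving systems of the form $Lx = b$.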
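
The scheme of Figure 2 can be sketched in a few lines. The version below is written in the standard preconditioned form, which applies $M^{-1}$ to the residual instead of scaling the system symmetrically with $M^{-1/2}$; the two forms are mathematically equivalent, but this is a substitution made here for brevity rather than the form shown in the figure. The function name `pcg_diag`, the dense list-of-rows storage, and the default parameters are illustrative assumptions.

```python
def matvec(a, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def norm_inf(x):
    return max(abs(xi) for xi in x)

def pcg_diag(a, b, tol=1e-10, max_it=100):
    """Diagonally preconditioned CG; returns (solution, iterations used)."""
    n = len(b)
    d = [a[i][i] for i in range(n)]          # preconditioner M = diag(A)
    x = [0.0] * n                            # zero initial guess, as in Figure 2
    r = list(b)                              # r0 = b - A x0 = b
    r0 = norm_inf(r)
    if r0 == 0.0:
        return x, 0
    z = [ri / di for ri, di in zip(r, d)]    # z = M^{-1} r
    p = list(z)
    rz = sum(ri * zi for ri, zi in zip(r, z))
    for k in range(1, max_it + 1):
        ap = matvec(a, p)
        alpha = rz / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if norm_inf(r) / r0 < tol:           # relative residual in the infinity norm
            return x, k
        z = [ri / di for ri, di in zip(r, d)]
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_it
```

Applied to the Laplacian of a path graph with five vertices and the right-hand side $b = (1, 0, 0, 0, -1)^T$, which is orthogonal to the constant null vector, this sketch reaches the solution $(2, 1, 0, -1, -2)^T$ in two iterations even though the matrix is singular.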

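We note that, in TraceMin-Fiedler, after the smallest eigenvector, which corresponds to the null space of
$L$, has converged, the preconditioned CG scheme in Figure 2 satisfies

$$\tilde{p}_k^T \tilde{L} \tilde{p}_k > 0. \tag{7}$$

Observing that $v \perp b$ due to the deflation step, the proof is given below.

Theorem 2.1 Let $L$ be symmetric positive semidefinite such that $Lv = 0$ (i.e. $\mathcal{N}(L) = \mathrm{span}[v]$)
and $M = \mathrm{diag}(L)$ ($\Rightarrow M^{-1/2} L M^{-1/2} M^{1/2} v = 0 \Rightarrow \tilde{L}\tilde{v} = 0$ where
$\tilde{v} = M^{1/2} v$).

The following statement is true for the preconditioned conjugate gradient method in Figure 2: if $v \perp b$
then $\tilde{v} \perp \tilde{p}_n$ and $\tilde{v} \perp \tilde{r}_n$.

Proof (by induction)

• The base case:

$$\begin{aligned}
\tilde{v}^T \tilde{p}_0 &= \tilde{v}^T \tilde{r}_0 \\
&= \tilde{v}^T (\tilde{b} - \tilde{L}\tilde{x}_0), \quad (\text{note } \tilde{x}_0 = 0) \\
&= \tilde{v}^T \tilde{b} \\
&= v^T M^{1/2} M^{-1/2} b \\
&= v^T b \\
&= 0
\end{aligned}$$

• Inductive hypothesis: Assume that $\tilde{v} \perp \tilde{p}_k$ and $\tilde{v} \perp \tilde{r}_k$ for $n = k$.

• Inductive step: Then for step $n = k+1$,

$$\tilde{v}^T \tilde{r}_{k+1} = \tilde{v}^T (\tilde{r}_k - \alpha_k \tilde{L} \tilde{p}_k) = 0 \quad (\text{by inductive hypothesis and } \tilde{v}^T \tilde{L} = 0)$$

and

$$\tilde{v}^T \tilde{p}_{k+1} = \tilde{v}^T (\tilde{r}_{k+1} + \beta_k \tilde{p}_k) = 0 \quad (\text{by inductive hypothesis})$$

Since $\tilde{L}$ is symmetric positive semidefinite with null space spanned by $\tilde{v}$, every nonzero
$\tilde{p}_k$ orthogonal to $\tilde{v}$ satisfies (7). Therefore, we do not need to use a diagonal perturbation
after the smallest eigenvalue and the corresponding eigenvector have converged.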

We note that our algorithm can easily compute additional eigenvectors of the Laplacian matrix by setting 
p to be the number of desired smallest eigenpairs. 

Parallelism in the algorithm is achieved by partitioning all vectors, $X_k$, $V_k$, $W_k$, and the coefficient
matrix $L$ by block rows, where each MPI process contains one block. The matrix and the vectors are
partitioned into blocks of roughly equal size. The most time-consuming operation in Figure 1 is the solution
of the linear systems involving $L$. The diagonal preconditioner does not require any communication. The
sparse matrix-vector multiplications do require communication, however, with the amount of communication
determined by the sparsity structure of the matrix. Therefore, the overall scalability of the algorithm is
problem dependent. In the implementation in this paper, we only communicate the elements that are needed
to complete the product via asynchronous point-to-point communication (i.e. using MPI_ISEND and
MPI_IRECV). The remaining operations that require communication are the inner products, which use
MPI_ALLREDUCE with vectors of multiple columns.
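
The block-row distribution and the resulting communication pattern can be illustrated without MPI. The sketch below is a serial model under stated assumptions: rows are split into contiguous blocks, each sparse row is stored as a dictionary from column index to value, and the hypothetical helper `remote_columns` lists the vector entries a given process would have to receive before its local sparse matrix-vector product. None of these names come from the paper's code.

```python
def block_row_ranges(n, nprocs):
    """Split rows 0..n-1 into nprocs contiguous blocks of roughly equal size."""
    base, extra = divmod(n, nprocs)
    ranges, start = [], 0
    for rank in range(nprocs):
        size = base + (1 if rank < extra else 0)   # first `extra` blocks get one more row
        ranges.append((start, start + size))
        start += size
    return ranges

def remote_columns(rows, ranges, rank):
    """Column indices owned by other processes that `rank` needs for its local SpMV."""
    lo, hi = ranges[rank]
    needed = set()
    for i in range(lo, hi):
        for j in rows[i]:                          # rows[i] maps column index -> value
            if not lo <= j < hi:
                needed.add(j)
    return sorted(needed)
```

For a tridiagonal matrix each process needs at most one entry from each neighbouring block, whereas a matrix with long-range couplings can require many more; this is the sense in which the scalability is problem dependent.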

3 Numerical Results 

We implement the TraceMin-Fiedler algorithm [12] of Figure 1 in parallel using MPI. We compare
the parallel performance of MC73_Fiedler with TraceMin-Fiedler using a cluster with an Infiniband
interconnect, where each node consists of two quad-core Intel Xeon CPUs (X5560) running at 2.80GHz
(8 cores per node). For both solvers we set the stopping tolerance for the $\infty$-norm of the eigenvalue
problem residual to $10^{-5}$. In TraceMin-Fiedler we set the inner stopping criterion (relative residual norm
for solving the linear systems using the preconditioned CG scheme) as $\epsilon_{in}$ = 10^^ × $\epsilon_{out}$,
and the maximum number of preconditioned CG iterations to 30. For MC73_Fiedler, we use all the
default parameters.

The set of test matrices is obtained from the University of Florida (UF) Sparse Matrix Collection [2].
A search for matrices in this collection which are square, real, and of order 2,000,000 < N <
5,000,000 returns the four matrices listed in Table 1. If the adjacency graph of $A$ has any disconnected
single vertices, we remove them, since those vertices are independent and have trivial solutions. We apply
both MC73_Fiedler and TraceMin-Fiedler to the weighted Laplacian generated from the adjacency graph
of the preprocessed matrix, where the weights are the absolute values of the matrix entries. After obtaining
the Fiedler vector $x_2$ produced by each algorithm, we compute the corresponding eigenvalue $\lambda_2$,

$$\lambda_2 = \frac{x_2^T L x_2}{x_2^T x_2}. \tag{8}$$
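
One common way to obtain the eigenvalue that belongs to a computed eigenvector is the Rayleigh quotient, and the quality of the pair can then be judged by a relative residual in the infinity norm. The sketch below assumes exactly these two definitions; the function names and the dense list-of-rows storage are illustrative only.

```python
def matvec(a, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def rayleigh_quotient(lap, x):
    """Eigenvalue estimate x^T L x / x^T x for an approximate eigenvector x."""
    lx = matvec(lap, x)
    return sum(xi * yi for xi, yi in zip(x, lx)) / sum(xi * xi for xi in x)

def relative_residual(lap, x, lam):
    """||L x - lam x||_inf / ||L||_inf, with ||L||_inf the maximum absolute row sum."""
    lx = matvec(lap, x)
    res = max(abs(yi - lam * xi) for yi, xi in zip(lx, x))
    norm_l = max(sum(abs(v) for v in row) for row in lap)
    return res / norm_l
```

For the path graph on three vertices, the vector $(1, 0, -1)^T$ is an exact Fiedler vector: its Rayleigh quotient is 1 and its relative residual is 0.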
Table 1: Matrix size (N), number of nonzeros (nnz), and type of test matrices.

Matrix Group/Name          N           nnz          symmetric   application
1. Rajat/rajat31           4,690,002   20,316,253   no          circuit simulation
2. Schenk/nlpkkt120        3,542,400   95,117,792   yes         nonlinear optimization
3. Freescale/Freescale1    3,428,755   17,052,626   no          circuit simulation
4. Zaoui/kkt_power         2,063,494   12,771,361   yes         optimum power flow


Table 2: Relative residuals $\|Lx - \lambda x\|_\infty / \|L\|_\infty$ for TraceMin-Fiedler and MC73_Fiedler where $\epsilon_{out} = 10^{-5}$.

                TraceMin-Fiedler                                              MC73_Fiedler
Matrix/Cores    1              8              16             32             1
rajat31         1.1 × 10^-12   1.1 × 10^-12   1.1 × 10^-12   1.1 × 10^-12   3.03 X 10"^
nlpkkt120       9.1 × 10^-6    9.1 × 10^-6    9.1 × 10^-6    9.1 × 10^-6    6.49 X lO-'^
Freescale1      7.5 × 10^-12   7.5 × 10^-12   7.5 × 10^-12   7.5 × 10^-12   1.03 X 10"^
kkt_power       3.1 × 10^-24   3.1 × 10^-24   3.1 × 10^-24   3.1 × 10^-24   4.07 X lO"**



We report the relative residuals $\|L x_2 - \lambda_2 x_2\|_\infty / \|L\|_\infty$ in Table 2.

The total times required by TraceMin-Fiedler using 1, 2, and 4 nodes with 8 MPI processes per node
(i.e. using 8 cores per node) are presented in Table 3. We emphasize that the parallel scalability results
for TraceMin-Fiedler are preliminary and that there is room for improvement. Since MC73_Fiedler is purely
sequential, we have used it on a single core. The speed improvements realized by TraceMin-Fiedler on 1, 8,
16, and 32 cores over MC73_Fiedler on a single core are depicted in Figure 3, and the actual solve times and
the speed improvement values are given in Tables 3 and 4. Note that on 32 cores, our scheme realizes speed
improvements over MC73_Fiedler that range between 4 and 641 for our four test matrices.

Next, we compute the Fiedler vector of a symmetric matrix of dimension 11,333,520 × 11,333,520
with 61,026,416 nonzeros. The matrix is obtained from a 3D Finite Volume Method (FVM) discretization
of a MEMS device. MC73_Fiedler consumes 75.5 seconds on a single core. The speed improvement of
TraceMin-Fiedler is given in Figure 4. We note that a run using a single core on a node has much more
memory bandwidth available per core than a run using all 8 cores of the node. Therefore, the speed
improvement from 1 to 8 cores (all on a single node) is not ideal. Using 256 cores, TraceMin-Fiedler is
44.2 times faster than MC73_Fiedler.

Table 3: Total time in seconds (rounded to the first decimal place) for TraceMin-Fiedler and MC73_Fiedler,
and the number of outer TraceMin iterations and the average number of inner PCG iterations for
TraceMin-Fiedler.

                TraceMin-Fiedler                                                    MC73_Fiedler
Matrix/Cores    # Outer(Avg. Inner) its.   1        8       16      32      1
rajat31         2(1)                       5.6s     1.4s    0.7s    0.4s    81.5s
nlpkkt120       2(30)                      100.5s   24.9s   15.3s   10.8s   83.9s
Freescale1      2(30)                      61.5s    23.5s   16.0s   12.5s   52.8s
kkt_power       2(1)                       4.8s     1.0s    0.7s    0.5s    341.6s



Figure 3: Speed improvement of TraceMin-Fiedler compared to uniprocessor MC73_Fiedler for four test
problems. Legend: rajat31, nlpkkt120, Freescale1, kkt_power.



Table 4: Speed improvement over MC73_Fiedler ($T_{MC73\_Fiedler}/T$).

                TraceMin-Fiedler                    MC73_Fiedler
Matrix/Cores    1       8        16       32       1
rajat31         14.5    59.2     116.5    227.5    1.0
nlpkkt120       0.8     3.4      5.5      7.8      1.0
Freescale1      0.9     2.2      3.3      4.2      1.0
kkt_power       71.2    332.3    501.0    641.4    1.0



Figure 4: Speed improvement of TraceMin-Fiedler compared to uniprocessor MC73_Fiedler for the fvm matrix,
$T_{MC73\_Fiedler} = 75.5$s. Horizontal axis: Number of Cores (8 Cores per Node).



4 Using the Fiedler vector for permuting the elements of a matrix 

One of the applications of the Fiedler vector is matrix reordering and bandwidth reduction. One can obtain
a permutation that reduces the (weighted or nonweighted) bandwidth of the matrix by sorting
the elements of the Fiedler vector (see [1, 10] for details).
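
The sorting step can be sketched as follows. The example assumes a symmetric permutation (the same ordering applied to rows and columns) and uses a hand-written vector as a stand-in for a computed Fiedler vector; the helper names are illustrative and not taken from any library.

```python
def fiedler_permutation(x):
    """Indices that sort the entries of an (approximate) Fiedler vector."""
    return sorted(range(len(x)), key=lambda i: x[i])

def permute_symmetric(a, perm):
    """Apply the same permutation to the rows and the columns of a dense matrix."""
    return [[a[i][j] for j in perm] for i in perm]

def bandwidth(a):
    """Largest |i - j| over the nonzero entries of a dense matrix."""
    return max((abs(i - j)
                for i, row in enumerate(a)
                for j, v in enumerate(row) if v != 0), default=0)

# Weighted path 0 - 2 - 1 stored with a poor vertex numbering.
a = [[0, 0, 5],
     [0, 0, 7],
     [5, 7, 0]]
x = [1.0, -1.0, 0.0]   # stand-in vector that is monotone along the path
perm = fiedler_permutation(x)
b = permute_symmetric(a, perm)
```

Here the permutation places the middle vertex of the path between its two neighbours, and the bandwidth of the permuted matrix drops from 2 to 1.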

In this section we propose a metric to measure the quality of the reordering, namely the relative bandweight.
We compare the quality of the Fiedler vector using this metric.

We define the relative bandweight of a specified band of the matrix as follows:

$$w_k(A) = \frac{\sum_{i,j:\,|i-j|<k} |A(i,j)|}{\sum_{i,j} |A(i,j)|}.$$

In other words, the bandweight of a matrix $A$, with respect to an integer $k$, is equal to the fraction of the
total magnitude of entries that are encapsulated in a band of half-width $k$.
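
A direct transcription of this metric for a small dense matrix might look as follows. The sketch assumes the strict inequality $|i - j| < k$ for membership in the band and a matrix with at least one nonzero entry; the function name `relative_bandweight` is an illustrative choice.

```python
def relative_bandweight(a, k):
    """Fraction of the total entry magnitude lying in the band |i - j| < k.

    Assumes `a` is a dense matrix (list of rows) with at least one nonzero entry.
    """
    total = sum(abs(v) for row in a for v in row)
    band = sum(abs(v)
               for i, row in enumerate(a)
               for j, v in enumerate(row)
               if abs(i - j) < k)
    return band / total
```

A value close to 1 for small $k$ means that most of the magnitude of the matrix sits near the main diagonal, which is the property a good weighted reordering aims for.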

To be able to visualize the effect of reordering, we randomly selected matrices of smaller dimension
from the UF Sparse Matrix Collection; they are listed in Table 5. The relative residuals for the Fiedler
vector computed by both methods and the number of iterations for TraceMin-Fiedler are given in Table 6.

In 2 cases out of 10, namely bcsstk22 and cvxbqp1, the relative residual of the Fiedler vector from
MC73_Fiedler did not reach the stopping tolerance of $10^{-5}$. In Figures 5 and 12, we depict the
relative bandweight comparison for these two cases and the resulting reordered matrices. In both cases
TraceMin-Fiedler produces a better reordering. For sparsine, the relative residual of MC73_Fiedler
($3.5 \times 10^{-10}$) is significantly better than that of TraceMin-Fiedler ($2.3 \times 10^{-6}$). However,
the quality of the reordering is better for TraceMin-Fiedler according to both our bandweight metric and the
sparsity plots of the reordered matrices. For 6 cases out of 10, TraceMin-Fiedler generated a better
reordering based on the sparsity plots and bandweights (see Figures 14, 13, 12, 11, 7 and 5), while in 3
cases (see Figures 10, 8 and 6) both methods produce reorderings of comparable quality. Finally, for eurqsa,
even though the bandweight measure indicates the reordering is slightly better if one uses MC73_Fiedler,
the sparsity plots indicate better clustering of large elements using TraceMin-Fiedler.

Table 5: Properties of test matrices.

Matrix       n        nnz         application
bcsstk22     138      696         structural mechanics
problem1     414      2,779       FEMLAB test matrix
rail_1357    1,357    8,985
c-19         2,327    21,817      nonlinear optimization
eurqsa       7,245    46,142      economics
tuma2        12,992   49,365      mine model
smt          25,710   3,749,582   structural mechanics
cvxbqp1      50,000   349,968     nonlinear optimization
sparsine     50,000   1,548,988   structural optimization
F2           71,505   5,294,285   structural mechanics
Table 6: Relative residuals and the approximate eigenvalue ($\lambda_2$).

                            TraceMin-Fiedler                                              MC73_Fiedler
Matrix      ||L||_inf       Relative Residual   lambda_2       # Outer(Avg. Inner) its.   Relative Residual   lambda_2
bcsstk22    5.3 X 10*'      4.7 X 10""          6.0 X 10-^     3(30)                      2.2 X 10-^          2.8 X lO''
problem1    1.7 X 10^       6.7 X 10"^          4.6 × 10^-2    3(30)                      2.7 × 10^-6         4.6 × 10^-2
rail_1357   9.1 × 10^-5     8.2 × 10^-6         2.8 X 10-''    4(30)                      5.4 × 10^-6         2.9 X 10-^
c-19        1.2 X 10"^      1.6 × 10^-6         3.8 × 10^-1    3(29)                      8.2 × 10^-6         4.0 × 10^-1
eurqsa      1.3 X lO'^      5.3 X 10-^          9.2 × 10^-1    2(30)                      2.9 X lO-'^         4.3 × 10^-1
tuma2       1.0 X 10'       2.6 × 10^-6         8.9 × 10^-4    8(30)                      9.5 × 10^-6         8.6 X 10-^
smt         1.8 X lO''      8.3 X 10-^          4.9 X 10^      2(30)                      5.2 × 10^-6         2.0 X 10^1
cvxbqp1     7.0 X 10^       6.2 × 10^-6         7.5 × 10^0     2(30)                      1.7 × 10^-2         9.4 X 10^
sparsine    3.2 X 10^       2.3 × 10^-6         1.4 X 10^      4(23)                      3.5 × 10^-10        1.0 X 10^
F2          4.2 X lO'^      1.5 X 10-^          1.0 X 10^      3(30)                      8.8 × 10^-6         4.7 X 10^


Figure 5: Sparsity plots of bcsstk22; red and blue indicate the largest and the smallest elements,
respectively, in the sparsity plots. Panels: (a) bandweight (no reordering, MC73 reordering, and
TraceMin-Fiedler reordering), (b) original matrix, (c) MC73_Fiedler, (d) TraceMin-Fiedler.

Figure 6: Sparsity plots of problem1; red and blue indicate the largest and the smallest elements,
respectively, in the sparsity plots. Panels: (a) bandweight, (b) original matrix, (c) MC73_Fiedler,
(d) TraceMin-Fiedler.

Figure 7: Sparsity plots of rail_1357; red and blue indicate the largest and the smallest elements,
respectively. Panels: (a) bandweight, (b) original matrix, (c) MC73_Fiedler, (d) TraceMin-Fiedler.

Figure 8: Sparsity plots of c-19; red and blue indicate the largest and the smallest elements, respectively.
Panels: (a) bandweight, (b) original matrix, (c) MC73_Fiedler, (d) TraceMin-Fiedler.

Figure 9: Sparsity plots of eurqsa; red and blue indicate the largest and the smallest elements, respectively.
Panels: (a) bandweight, (b) original matrix, (c) MC73_Fiedler, (d) TraceMin-Fiedler.

5 Conclusions



We have presented a new algorithm for computing the Fiedler vector on parallel computing platforms, and
have shown its effectiveness compared to the well-known scheme given by the routine MC73_Fiedler of the
Harwell Subroutine Library for computing the Fiedler vector of four large sparse matrices. The scalability of
the method was demonstrated for a matrix of dimension 11 million on a cluster. Finally, we have compared
the quality of the reordering produced from the Fiedler vector for a variety of matrices from the UF Sparse
Matrix Collection and proposed the bandweight as a metric to measure the quality of the reordering.

Acknowledgments 

The author is grateful to Professors Ahmed Sameh and Ananth Grama for their useful comments and
support, as well as for introducing the idea of weighted reordering. The author also thanks Dr. Faisal Saied
for providing an implementation of the trace minimization method and Eric Cox for his suggestions for
improving the presentation of this paper.

References 

[1] Stephen T. Barnard, Alex Pothen, and Horst Simon. A spectral algorithm for envelope reduction of 
sparse matrices. Numerical Linear Algebra with Applications, 2(4):317-334, 1995. 

[2] T. A. Davis. University of Florida sparse matrix collection. NA Digest, 1997. 

[3] M. Fiedler. Algebraic connectivity of graphs. Czechoslovak Mathematical Journal, 23(2):298-305, 
1973. 

[4] Xiaofeng He, Hongyuan Zha, Chris H.Q. Ding, and Horst D. Simon. Web document clustering using
hyperlink structures. Computational Statistics & Data Analysis, 41(1):19-45, 2002.

[5] Desmond J. Higham, Gabriela Kalna, and Milla Kibble. Spectral clustering and its use in
bioinformatics. Journal of Computational and Applied Mathematics, 204(1):25-37, 2007. Special issue
dedicated to Professor Shinnosuke Oharu on the occasion of his 65th birthday.

[6] HSL. A collection of Fortran codes for large-scale scientific computation, 2004. See
http://www.cse.scitech.ac.uk/nag/hsl/.

[7] Y.F. Hu and J.A. Scott. HSL_MC73: a fast multilevel Fiedler and profile reduction code. Technical
Report RAL-TR-2003-036, 2003.

[8] S. Kundu, D.C. Sorensen, and G.N. Phillips Jr. Automatic domain decomposition of proteins by a
Gaussian network model. Proteins: Structure, Function, and Bioinformatics, 57(4):725-733, 2004.

[9] M. Manguoglu. A parallel hybrid sparse linear system solver. In Computational Electromagnetics
International Workshop, 2009. CEM 2009, pages 38-43, July 2009.

[10] M. Manguoglu, M. Koyuturk, A. Grama, and A. H. Sameh. Weighted matrix ordering and parallel
banded preconditioners for iterative linear system solvers. SIAM Journal on Scientific Computing,
accepted.

[11] Murat Manguoglu, Ahmed H. Sameh, and Olaf Schenk. Pspike: A parallel hybrid sparse linear system
solver. Lecture Notes in Computer Science (Euro-Par 2009 Parallel Processing), 5704:797-808, 2009.

[12] M. Manguoglu, F. Saied, E. Cox, and A. Sameh. http://www.cs.purdue.edu/homes/mmanguog/fiedler.html,
2010.

[13] Andrew Y. Ng, Michael I. Jordan, and Yair Weiss. On spectral clustering: Analysis and an algorithm.
In Advances in Neural Information Processing Systems 14, pages 849-856. MIT Press, 2001.

[14] Alex Pothen, Horst D. Simon, and Kan-Pu Liou. Partitioning sparse matrices with eigenvectors of
graphs. SIAM J. Matrix Anal. Appl., 11(3):430-452, 1990.

[15] Huaijun Qiu and Edwin R. Hancock. Graph matching and clustering using spectral partitions. Pattern
Recognition, 39(1):22-34, 2006.

[16] Ahmed Sameh and Zhanye Tong. The trace minimization method for the symmetric generalized
eigenvalue problem. J. Comput. Appl. Math., 123(1-2):155-175, 2000.

[17] Ahmed H. Sameh and John A. Wisniewski. A trace minimization algorithm for the generalized
eigenvalue problem. SIAM Journal on Numerical Analysis, 19(6):1243-1259, 1982.

[18] S. J. Shepherd, C. B. Beggs, and S. Jones. Amino acid partitioning using a Fiedler vector model.
European Biophysics Journal, 37(1):105-109, 2007.



Figure 10: Sparsity plots of tuma2; red and blue indicate the largest and the smallest elements, respectively.
Panels: (a) bandweight (relative bandweight versus k for no reordering, MC73 reordering, and
TraceMin-Fiedler reordering), (b) original matrix, (c) MC73_Fiedler, (d) TraceMin-Fiedler.

Figure 11: Sparsity plots of smt; red and blue indicate the largest and the smallest elements, respectively,
in the sparsity plots.

Figure 12: Sparsity plots of cvxbqp1; red and blue indicate the largest and the smallest elements,
respectively.

Figure 13: Sparsity plots of sparsine; red and blue indicate the largest and the smallest elements,
respectively.

Figure 14: Sparsity plots of F2; red and blue indicate the largest and the smallest elements, respectively.